ABSTRACT


1.  INTRODUCTION


Consider a context with $p + 1$ random variables $Y$ and
$X = (X_1, \ldots, X_p)'$. Suppose that $Y$ is viewed as a dependent
variable and $X_1, \ldots, X_p$ as independent variables, and that interest
is in measuring the degree of association between $Y$ and
$X_1, \ldots, X_p$, as is typical with a multiple correlation parameter.


The classical multiple correlation coefficient
$\rho_{Y \cdot X_1 \ldots X_p}$ of the multivariate normal model has many
useful properties, but it lacks robustness (see Huber (1977)). Its sample
estimate is sensitive to outliers and heavier-tailed distributions and can
be inefficient for nonnormal distributions. An alternative measure that is
more robust in such situations is needed.

One important property of $\rho_{Y \cdot X_1 \ldots X_p}$ is that it is the
Pearson correlation between $Y$ and a best linear prediction of $Y$ from
$X$ in the sense of minimum squared error. In this way
$\rho_{Y \cdot X_1 \ldots X_p}$ is directly related to regression concepts
in interpretation and methodology. This property can be retained in
defining a more robust multiple correlation coefficient if the
correlation measure and the linear predictor are replaced by more
robust choices. This paper will explore such a measure using a
linear predictor based on rank estimates of regression coefficients.
The measure of association used will be a weighted
Kendall's tau parameter which is directly comparable with the
rank-regression approach.


Estimates of regression coefficients based on rank statistics
have been developed by many authors; in particular, see
Jureckova  (1971),  Jaeckel  (1972),  McKean  and  Hettmansperger 
(1976,  1977)  and  Sievers  (1983)  for  some  of  the  basic  properties 
and  results  on  their  robustness  and  efficiency.  The  connection 
between  weighted  Kendall's  tau  statistics  and  rank  regression 
statistics  was  mentioned  in  Sievers  (1978). 

In a bivariate setting, Kendall's tau is a widely used
nonparametric measure of association. Several useful extensions
have been discussed for multivariate settings; see Moran (1951),
Bobko (1977) and Agresti (1977). This paper will differ by
emphasizing the connection to the corresponding regression and
prediction problem. A natural population parameter will be used to
allow  for  a  direct,  meaningful  interpretation  of  sample  results. 
The  sample  estimate  should  be  highly  efficient,  in  contrast  to 
earlier  methods,  although  a  stronger  model  is  needed. 

The basic measure of association treated here is a weighted
Kendall's tau. The weights will be important in keeping the
correlation measure directly compatible with the corresponding
regression and prediction concepts and methods. In the regression
problem it is known that weights should be used to avoid low
efficiency; see Sievers (1978), Scholz (1977). Only in carefully
designed experiments where nonrandom, equally spaced values for
the independent variables can be set would the weights be
unnecessary, and in such situations multiple correlation issues
are usually not important.

2.  THE  BIVARIATE  CASE 

This section considers the bivariate case to introduce some
ideas and motivate the main definition to follow. Consider a
pair of random variables $(Y, X)$ with a nondegenerate bivariate
distribution. Let $(Y_1, X_1)$ and $(Y_2, X_2)$ be independent with
the same distribution as $(Y, X)$. A widely used nonparametric
measure of association is Kendall's tau,

\[ \tau = E(\mathrm{sgn}(X_2 - X_1)\,\mathrm{sgn}(Y_2 - Y_1)), \]

where $\mathrm{sgn}(t) = -1, 0, 1$ as $t < 0$, $t = 0$, $t > 0$. The value
of $\tau$ is in $[-1, 1]$. Following by analogy the Pearson correlation,
one could take the absolute value to obtain a multiple correlation
coefficient, although it is not clear how useful this could be.

The Kendall tau is symmetric in the roles of $X$ and $Y$.
However, in the multiple correlation context the variables should
be treated asymmetrically, with $Y$ and $X$ playing the part of a
dependent and an independent variable, respectively. This would
relate multiple correlation concepts more directly to regression and
prediction concepts, as is familiar with the classical
$\rho_{Y \cdot X_1 \ldots X_p}$. Moreover, in the regression problem it has
been noted in Jaeckel (1972), Scholz (1977) and Sievers (1978) that
the use of weights depending on $X$ is needed to obtain high
efficiency in the nonparametric procedure based on Kendall's tau.

These considerations motivate a definition of a correlation
coefficient

\[
\tau^* = \frac{E(|X_2 - X_1|\,\mathrm{sgn}(X_2 - X_1)\,\mathrm{sgn}(Y_2 - Y_1))}{E(|X_2 - X_1|)}
       = \frac{E((X_2 - X_1)\,\mathrm{sgn}(Y_2 - Y_1))}{E(|X_2 - X_1|)},
\]

where in the first form, the numerator is a weighted Kendall's
tau and the denominator is a suitable norming factor. The use of
differences here is natural for parameters based on rank order.

It is worth noting that the product-moment correlation coefficient
can also be expressed in terms of differences as
$\rho = E((X_2 - X_1)(Y_2 - Y_1)) / [E(X_2 - X_1)^2\, E(Y_2 - Y_1)^2]^{1/2}$.
Thus $\tau^*$ is "in between" $\rho$ and Kendall's tau by replacing one of
the variables, $Y_2 - Y_1$, by $\mathrm{sgn}(Y_2 - Y_1)$.

The parameter $\tau^*$ has several desirable properties:
$|\tau^*| \le 1$; $\tau^*$ is invariant under linear transformations of the
variables; $\tau^* = 0$ if $X$ and $Y$ are independent; and $\tau^* = 1$ if
$Y$ is a linear function of $X$ with probability one. Also, if $X$
and $Y$ have a bivariate normal distribution with correlation $\rho$,
then $\tau^* = \rho$. These properties will be discussed in more
detail in the multivariate case in the next section.

The definition of $\tau^*$ above does not lend itself readily to
an extension to higher dimensions, and it is not suitable for a
multiple correlation since $\tau^*$ can be negative. The following
change in the definition will allow a natural extension. Let $\beta_*$
be a value of $\beta$ minimizing $E(|(Y_2 - Y_1) - \beta(X_2 - X_1)|)$.
Then define

\[ \tau = \frac{E(\beta_*(X_2 - X_1)\,\mathrm{sgn}(Y_2 - Y_1))}{E(|\beta_*(X_2 - X_1)|)} \]

if $\beta_* \ne 0$ and $\tau = 0$ if $\beta_* = 0$. Factoring out $\beta_*$,
it follows that $\tau = \mathrm{sgn}(\beta_*)\,\tau^*$, so there is at most
a sign difference between $\tau$ and $\tau^*$. Later it is shown that
$\tau$ is nonnegative.
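As an illustration of the bivariate definition, the following is a minimal sketch of a sample analog of $\tau^*$, obtained by replacing the two expectations with sums over all pairs $i < j$. The function names `sign` and `weighted_tau_star` are hypothetical, and the plug-in form is an assumption of this sketch; the paper's own sample estimate is the one defined in Section 4.

```python
from itertools import combinations


def sign(t):
    """Return -1, 0 or 1 according to the sign of t."""
    return (t > 0) - (t < 0)


def weighted_tau_star(x, y):
    """Plug-in analog of tau*: sum of (x_j - x_i) * sgn(y_j - y_i)
    over pairs i < j, normed by the sum of |x_j - x_i|."""
    num = 0.0
    den = 0.0
    for i, j in combinations(range(len(x)), 2):
        dx = x[j] - x[i]
        num += dx * sign(y[j] - y[i])
        den += abs(dx)
    if den == 0:
        raise ValueError("all x values are equal; the ratio is undefined")
    return num / den


x = [1.0, 2.0, 4.0, 7.0]
print(weighted_tau_star(x, [3.0, 5.0, 9.0, 15.0]))  # y increasing in x
print(weighted_tau_star(x, [15.0, 9.0, 5.0, 3.0]))  # y decreasing in x
```

The second call gives a negative value, which is the reason the text replaces $\tau^*$ by $\tau = \mathrm{sgn}(\beta_*)\,\tau^*$ before extending the definition to the multiple correlation setting.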


3.  THE  MULTIVARIATE  CASE 

Consider random variables $Y$ and $X = (X_1, \ldots, X_p)'$.
Assume they have finite expectations, but otherwise their
distribution can be quite arbitrary for some of the material in this
section. Of special interest here is the model that specifies the
joint cdf of $Y$ and $X$ to be of the form

\[ F(y - \beta_0' x)\,H(x), \tag{3.1} \]

where $F$ is a univariate cdf, $H$ is a $p$-dimensional cdf and
$\beta_0 = (\beta_{01}, \ldots, \beta_{0p})'$ is a vector of unknown
parameters. In this model the conditional cdf of $Y$ given $X = x$ is
$F(y - \beta_0' x)$. This property appears in the multivariate normal
model, but here $F$ is not assumed normal. No symmetry or
centering assumptions are made on $F$ or $H$. Alternatively, this
model can be expressed as

\[ Y = \beta_0' X + e, \tag{3.2} \]

where $X$ has cdf $H$, $e$ has cdf $F$, and $X$ and $e$ are
independent.

Considerations in the bivariate case lead to the following
definition of a multiple correlation parameter. Let $(Y_1, X_1)$
and $(Y_2, X_2)$ be independent, each having the distribution of
$(Y, X)$. Suppose $\beta_* = (\beta_{*1}, \ldots, \beta_{*p})'$ minimizes

\[
E[\,|(Y_2 - Y_1) - \beta'(X_2 - X_1)|\,]
  = E[\,|(Y_2 - \beta' X_2) - (Y_1 - \beta' X_1)|\,] \tag{3.3}
\]

as a function of $\beta$. Then define a multiple correlation
parameter by

\[
\tau = \frac{E\left[\sum_{k=1}^{p} \beta_{*k}(X_{2k} - X_{1k})\,\mathrm{sgn}(Y_2 - Y_1)\right]}
            {E\left[\left|\sum_{k=1}^{p} \beta_{*k}(X_{2k} - X_{1k})\right|\right]}
     = \frac{E[\beta_*'(X_2 - X_1)\,\mathrm{sgn}(Y_2 - Y_1)]}
            {E[\,|\beta_*'(X_2 - X_1)|\,]} \tag{3.4}
\]

if $\beta_* \ne 0$, and let $\tau = 0$ if $\beta_* = 0$. In the notation
here $X_i = (X_{i1}, \ldots, X_{ip})'$, $i = 1, 2$. Note that (3.3) is
a convex function of $\beta$. In most cases of practical interest
$\beta_*$ will be unique. For ambiguous cases $\tau$ will be left undefined.


Note that $\tau$ is defined as the weighted Kendall's tau as
modified in Section 2 for $Y$ vs $\beta_*' X$. The linear function
$\beta_*' X$ can be viewed as a best linear predictor of $Y$ in the sense
of minimizing the variation in $Y - \beta' X$ as measured by the absolute
difference of two independent copies. Recall that if $z_1$ and
$z_2$ are independent copies of a random variable $z$, then
$E(|z_2 - z_1|)$ measures the variation in $z$ (a Gini mean
difference parameter). Being of first order, this will be less
sensitive to contamination and heavy tails in the distribution in
comparison to the square function $E((z_2 - z_1)^2) = 2\,\mathrm{var}(z)$
used in the classical approach. (3.3) is the population analog of
the dispersion function used in Sievers (1983).
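The sensitivity contrast between the first-order and squared measures of variation can be made concrete with a small numerical sketch. The function names and the toy data below are hypothetical and are not taken from the paper; the sample versions simply average over all pairs.

```python
from itertools import combinations


def gini_mean_difference(z):
    """Average of |z_j - z_i| over all pairs i < j."""
    pairs = list(combinations(z, 2))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def mean_squared_difference(z):
    """Average of (z_j - z_i)^2 over all pairs; equals 2 * sample variance."""
    pairs = list(combinations(z, 2))
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)


clean = [1.0, 2.0, 3.0, 4.0, 5.0]
dirty = [1.0, 2.0, 3.0, 4.0, 50.0]  # same data with one gross outlier

for name, data in (("clean", clean), ("dirty", dirty)):
    print(name, gini_mean_difference(data), mean_squared_difference(data))
```

On these data the single outlier inflates the absolute-difference measure by a factor of 10 and the squared-difference measure by a factor of 181, which is the kind of behavior the preceding paragraph describes.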

Remark 3.1. Assume model (3.1). Let $G$ denote the cdf of
the difference of two independent random variables each having
cdf $F$ and assume $G$ has a unique median. Then $\beta_0$ is the
unique point minimizing (3.3) and

\[
\tau = \frac{E[\beta_0'(X_2 - X_1)\,\mathrm{sgn}(Y_2 - Y_1)]}{E[\,|\beta_0'(X_2 - X_1)|\,]}.
\]

Proof. Under model (3.1) the conditional distribution of
$W = Y_2 - Y_1$ given $X_2 - X_1 = t$ has cdf $G(w - \beta_0' t)$. This
distribution has a unique median of $\beta_0' t$, since $G$ has a unique
median by assumption and its value is 0 because the difference of two
independent copies is symmetrically distributed about 0. It is
well-known that the median minimizes an expected absolute deviation.
Thus for each fixed $t$, $E[\,|W - a| \mid t\,]$ is minimum if
$a = \beta_0' t$ and the result follows. ■

Remark 3.2. If $Y$ and $X$ are independent, then $\tau = 0$.

Proof. A conditional argument as in the previous proof
shows that $\beta = 0$ minimizes (3.3), although it may not be unique.
Regardless, independence implies that the numerator of $\tau$
factors and the result follows from $E(\mathrm{sgn}(Y_2 - Y_1)) = 0$ by
symmetry. ■

The following remark shows an important property: that
$\tau = 0$ is equivalent to $Y$ and $X$ being independent in model
(3.1). The classical parameter $\rho_{Y \cdot X_1 \ldots X_p}$ has this
property for the multivariate normal model but not, in general, for
nonnormal cases.

Remark 3.3. Assume model (3.1) holds with $X$ having a
nondegenerate distribution. Then

\[ \tau = 0 \iff \beta_0 = 0 \iff Y \text{ and } X \text{ are independent.} \]

Proof. Because of the form assumed for the joint cdf of $Y$
and $X$ in model (3.1), $Y$ and $X$ are independent if and only
if $\beta_0 = 0$. If $\beta_0 = 0$ then $\tau = 0$ by definition. It
remains to show that $\beta_0 \ne 0$ implies $\tau \ne 0$.

Assuming $\beta_0 \ne 0$, $T = \beta_0'(X_2 - X_1)$ has a nondegenerate
distribution with cdf say $L(t)$. Let $W = Y_2 - Y_1$. Under
model (3.1) the conditional cdf of $W$ given $T = t$ is
$G(w - t)$. The numerator of $\tau$ is

\[
\begin{aligned}
E(T\,\mathrm{sgn}(W)) &= E[\,T\{P(W > 0 \mid T) - P(W < 0 \mid T)\}\,] \\
                      &= E[\,T(1 - 2G(-T))\,] \\
                      &= E[\,T(2G(T) - 1)\,] \\
                      &= 2E(T\,G(T)),
\end{aligned}
\]

using $G(t) + G(-t) = 1$ and $E(T) = 0$. Then since $T$ is symmetrically
distributed about 0, this equals

\[ 2\int_0^\infty t\,[2G(t) - 1]\,dL(t). \]

The integrand is positive on the range of integration and with $T$
having a nondegenerate distribution the integral is positive.
Thus $\tau \ne 0$ as was to be shown. ■

Remark 3.4. $0 \le \tau \le 1$.

Proof. The upper bound follows from

\[
|E(\beta_*'(X_2 - X_1)\,\mathrm{sgn}(Y_2 - Y_1))|
  \le E(|\beta_*'(X_2 - X_1)\,\mathrm{sgn}(Y_2 - Y_1)|)
  \le E(|\beta_*'(X_2 - X_1)|).
\]

For the lower bound, it is enough to show the numerator of $\tau$ is
nonnegative. Let $W = Y_2 - Y_1$ and $T = \beta_*'(X_2 - X_1)$. Then since
$\beta_*$ minimizes (3.3), write

\[
\begin{aligned}
0 &\le E(|W|) - E(|W - T|) = E(|W| - |W - T|) \\
  &= \int_{w>0,\ t<w} t + \int_{w>0,\ t>w} (2w - t)
     + \int_{w<0,\ t<w} (t - 2w) + \int_{w<0,\ t>w} (-t) \\
  &\le \int_{w>0,\ t<w} t + \int_{w>0,\ t>w} t
     + \int_{w<0,\ t<w} (-t) + \int_{w<0,\ t>w} (-t) \\
  &= \int_{w>0} t + \int_{w<0} (-t) = \int t\,\mathrm{sgn}(w),
\end{aligned}
\]

where for simplicity the differential part of the integrals was
omitted. This last expression, the numerator of $\tau$, is thus
nonnegative. ■
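The case analysis in the lower-bound argument amounts to the pointwise inequality $|w| - |w - t| \le t\,\mathrm{sgn}(w)$. The following sketch checks it numerically on a grid; the grid and the helper name `gap` are hypothetical choices made only for this illustration, and the check is not a substitute for the proof.

```python
def sign(t):
    """Return -1, 0 or 1 according to the sign of t."""
    return (t > 0) - (t < 0)


def gap(w, t):
    """t * sgn(w) - (|w| - |w - t|); the argument needs this to be >= 0."""
    return t * sign(w) - (abs(w) - abs(w - t))


# Quarter-spaced grid on [-5, 5]; these values are exact in floating point.
grid = [k / 4 for k in range(-20, 21)]
min_gap = min(gap(w, t) for w in grid for t in grid)
print(min_gap >= 0)
```

Taking expectations of the pointwise inequality with $w = W$ and $t = T$ gives $E(|W|) - E(|W - T|) \le E(T\,\mathrm{sgn}(W))$, which is the chain displayed in the proof.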

Remark 3.5. If $Y_2 - Y_1$ and $\beta_*'(X_2 - X_1)$ have the same
sign with probability one, then $\tau = +1$.

Proof. Let $W = Y_2 - Y_1$ and $T = \beta_*'(X_2 - X_1)$. The
hypothesis implies $\mathrm{sgn}(W) = \mathrm{sgn}(T)$ with probability one.
Then the numerator of $\tau$ is
$E(T\,\mathrm{sgn}(W)) = E(T\,\mathrm{sgn}(T)) = E(|T|)$, which
is the denominator of $\tau$. ■


The following remark shows that for the multivariate normal
model, $\tau$ is identical to the classical multiple correlation
coefficient $\rho_{Y \cdot X_1 \ldots X_p}$. Thus it would share its many
useful properties for this model.

Remark 3.6. If $Y$ and $X$ have a multivariate normal
distribution, then $\tau = \rho_{Y \cdot X_1 \ldots X_p}$.

Proof. If $Y$ and $X$ have a multivariate normal distribution
then model (3.1) holds with $\beta_0$ being the vector of
least-squares regression coefficients. It is well-known that
$\rho_{Y \cdot X_1 \ldots X_p}$ is the (Pearson) correlation coefficient of
$Y$ and $\beta_0' X$. This is the same as the Pearson correlation
coefficient of the differences $W = Y_2 - Y_1$ and
$T = \beta_0'(X_2 - X_1)$. Thus $W$ and $T$ have a bivariate normal
distribution with zero means and correlation
$\rho_{Y \cdot X_1 \ldots X_p}$. It is straightforward to show that

\[
E(T\,\mathrm{sgn}(W)) = \rho_{Y \cdot X_1 \ldots X_p}\,\sigma_T \sqrt{2/\pi}
\quad\text{and}\quad
E(|T|) = \sigma_T \sqrt{2/\pi},
\]

where $\sigma_T$ is the standard deviation of $T$, and the result
follows. ■
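One way to fill in the step described as straightforward is sketched below. It uses only the bivariate normality of $(W, T)$; the symbols $\sigma_W$ (the standard deviation of $W$), $Z$ (the part of $T$ independent of $W$) and the shorthand $\rho$ for $\rho_{Y \cdot X_1 \ldots X_p}$ are introduced here for the illustration and do not appear in the original.

```latex
% (W, T) bivariate normal with zero means and correlation \rho.
\begin{align*}
T &= \rho\,\frac{\sigma_T}{\sigma_W}\,W + Z,
   \qquad Z \text{ independent of } W,\quad E(Z) = 0, \\
E(T\,\mathrm{sgn}(W))
  &= \rho\,\frac{\sigma_T}{\sigma_W}\,E(|W|) + E(Z)\,E(\mathrm{sgn}(W))
   = \rho\,\frac{\sigma_T}{\sigma_W}\,\sigma_W\sqrt{2/\pi}
   = \rho\,\sigma_T\sqrt{2/\pi}, \\
E(|T|) &= \sigma_T\sqrt{2/\pi}
   \quad\Longrightarrow\quad
   \tau = \frac{E(T\,\mathrm{sgn}(W))}{E(|T|)} = \rho .
\end{align*}
```

Both expectations rely on the standard fact that a $N(0, \sigma^2)$ variable has mean absolute value $\sigma\sqrt{2/\pi}$.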

Remark 3.7. $\tau$ is invariant under nonsingular linear
transformations of $Y$ and $X$.

Proof. $\tau$ depends on $Y$ through a signed difference and it
is clear that a linear transformation of $Y$ would have no
effect. If $X$ is replaced by $CX$, where $C$ is a $p \times p$
nonsingular matrix, then the $\beta_*$ minimizing (3.3) changes to
$(C')^{-1}\beta_*$. Substituting in (3.4),
$((C')^{-1}\beta_*)'(CX_i) = \beta_*' X_i$, $i = 1, 2$, and no change in
$\tau$ would occur. ■


4.  SAMPLE  ESTIMATE  OF  $\tau$

Let $(Y_1, X_1), (Y_2, X_2), \ldots, (Y_n, X_n)$ be independent
replicates of $(Y, X)$, where $X_i = (X_{i1}, \ldots, X_{ip})'$,
$1 \le i \le n$, and $X = (X_1, \ldots, X_p)'$. Define an $n \times 1$
vector $\mathbf{Y} = (Y_1, \ldots, Y_n)'$, an $n \times p$ matrix
$A = (X_{ik})$, a parameter vector
$\beta_0 = (\beta_{01}, \ldots, \beta_{0p})'$ and an error vector
$\mathbf{e} = (e_1, \ldots, e_n)'$. If $(Y, X)$ satisfies model (3.2),
then

\[ \mathbf{Y} = A\beta_0 + \mathbf{e}, \tag{4.1} \]

where the elements of $\mathbf{e}$ are iid with cdf $F$, the rows of $A$
are iid with cdf $H$ and $A$ is independent of $\mathbf{e}$. An intercept
parameter could be added to this model, but the procedures here are
based on differences and it would cancel out and have no effect.

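An estimate of $\tau$ can be defined in a natural way as
follows. First let $\hat\beta = (\hat\beta_1, \ldots, \hat\beta_p)'$ be a
vector that minimizes a dispersion measure of the residuals given by

\[
D(\beta) = \sum_{i<j} \left| (Y_j - Y_i) - \sum_{k=1}^{p} \beta_k (X_{jk} - X_{ik}) \right|
         = \sum_{i<j} |(Y_j - \beta' X_j) - (Y_i - \beta' X_i)|. \tag{4.2}
\]

Then define an estimate of $\tau$ as

\[
\hat\tau = \frac{\sum_{i<j} \sum_{k} \hat\beta_k (X_{jk} - X_{ik})\,\mathrm{sgn}(Y_j - Y_i)}
                {\sum_{i<j} \left| \sum_{k} \hat\beta_k (X_{jk} - X_{ik}) \right|}
         = \frac{\sum_{i<j} \hat\beta'(X_j - X_i)\,\mathrm{sgn}(Y_j - Y_i)}
                {\sum_{i<j} |\hat\beta'(X_j - X_i)|} \tag{4.3}
\]

if $\hat\beta \ne 0$ and $\hat\tau = 0$ if $\hat\beta = 0$.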

The dispersion function (4.2) is a convex, piecewise linear
function of $\beta$ and as a result there will be a point attaining
the minimum, although it may not be unique. This is the same
dispersion function used in Sievers (1983) and is algebraically
equal to the dispersion function in Jaeckel (1972) and in McKean
and Hettmansperger (1976, 1977) when Wilcoxon scores are used.
These references point out that the diameter of the set of points
attaining the minimum tends to zero asymptotically. Further, $\hat\beta$
is the rank estimate of the regression coefficients $\beta_0$, and these
references contain further results on properties of $\hat\beta$,
computational methods and more.

The estimate $\hat\tau$ has the following properties:
$0 \le \hat\tau \le 1$; $\hat\tau = +1$ if the rank order of the fitted
values $A\hat\beta$ is the same as the rank order of $\mathbf{Y}$; and
$\hat\tau$ is invariant under nonsingular linear transformations on
$Y_i$ and $X_i$.
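For a single independent variable ($p = 1$) the estimate can be computed directly, since $D(\beta) = \sum_{i<j} |X_j - X_i|\,|s_{ij} - \beta|$ with pairwise slopes $s_{ij} = (Y_j - Y_i)/(X_j - X_i)$, so a weighted median of the slopes attains the minimum. The sketch below follows that route; the function names are hypothetical, the lower weighted median is an arbitrary choice when the minimizer is not unique, and the general-$p$ case would need a numerical minimizer that is not shown here.

```python
from itertools import combinations


def sign(t):
    """Return -1, 0 or 1 according to the sign of t."""
    return (t > 0) - (t < 0)


def slope_estimate(x, y):
    """Minimize D(b) = sum_{i<j} |(y_j - y_i) - b (x_j - x_i)| for p = 1
    by taking a weighted median of pairwise slopes, weights |x_j - x_i|."""
    slopes = []
    for i, j in combinations(range(len(x)), 2):
        dx = x[j] - x[i]
        if dx != 0:
            slopes.append(((y[j] - y[i]) / dx, abs(dx)))
    slopes.sort()
    half = sum(w for _, w in slopes) / 2.0
    running = 0.0
    for s, w in slopes:
        running += w
        if running >= half:
            return s  # lower weighted median
    raise ValueError("need at least two distinct x values")


def tau_hat(x, y):
    """Sample estimate (4.3) for p = 1, using the slope estimate above."""
    b = slope_estimate(x, y)
    if b == 0:
        return 0.0
    num = den = 0.0
    for i, j in combinations(range(len(x)), 2):
        fit = b * (x[j] - x[i])
        num += fit * sign(y[j] - y[i])
        den += abs(fit)
    return num / den


print(slope_estimate([1.0, 2.0, 3.0], [1.0, 3.0, 2.0]))
print(tau_hat([1.0, 2.0, 3.0], [1.0, 3.0, 2.0]))
print(tau_hat([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0]))
```

The last call uses a decreasing relationship and still returns a value of one, illustrating that the sign of the slope estimate is absorbed into the fitted differences.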

The estimate $\hat\tau$ can be expressed in another form to view it
more explicitly as a rank statistic. First note the formula

\[
\sum_{i<j} (X_{jk} - X_{ik})\,\mathrm{sgn}(Y_j - Y_i)
  = \sum_{i} X_{ik}(2S_i - (n+1))
  = 2\sum_{i} (X_{ik} - \bar X_k)\,S_i, \tag{4.4}
\]

where $S_i$ is the rank of $Y_i$ among $Y_1, \ldots, Y_n$ and
$\bar X_k = \sum_i X_{ik}/n$. Using this, the numerator of $\hat\tau$ is

\[
2\sum_{k} \hat\beta_k \sum_{i} (X_{ik} - \bar X_k)\,S_i
  = 2\,\hat\beta' A_c' \mathbf{S} = 2\,\hat{\mathbf{Y}}'\mathbf{S},
\]

where $\mathbf{S} = (S_1, \ldots, S_n)'$, $\hat{\mathbf{Y}} = A_c\hat\beta$
is the vector of centered fitted values and
$A_c = (X_{ik} - \bar X_k)_{n \times p}$ is the centered $A$ matrix.
(Alternately the rank vector could be centered.) Writing the
denominator of $\hat\tau$ as
$\sum_{i<j} \sum_{k} \hat\beta_k (X_{jk} - X_{ik})\,
\mathrm{sgn}\left(\sum_{k} \hat\beta_k (X_{jk} - X_{ik})\right)$
and applying the same method gives

\[ \hat\tau = \hat{\mathbf{Y}}'\mathbf{S} \,/\, \hat{\mathbf{Y}}'\hat{\mathbf{S}}, \tag{4.5} \]

where $\hat{\mathbf{S}} = (\hat S_1, \ldots, \hat S_n)'$ is the rank vector
of $\hat{\mathbf{Y}}$.

Thus, up to a common factor, the numerator of $\hat\tau$ is
$\mathrm{cov}(\hat{\mathbf{Y}}, \mathbf{S})$ and the denominator
is $\mathrm{cov}(\hat{\mathbf{Y}}, \hat{\mathbf{S}})$. The covariance of
$\hat{\mathbf{Y}}$ with a permutation of the integers $(1, \ldots, n)$ is
maximum when the integers are in the same order as the elements of
$\hat{\mathbf{Y}}$; see Jaeckel (1972). Thus the denominator is the maximum
covariance of $\hat{\mathbf{Y}}$ with a rank vector. This supports the
choice of denominator in $\hat\tau$, verifies $\hat\tau \le 1$,
and shows $\hat\tau = +1$ when $\hat{\mathbf{Y}}$ and $\mathbf{Y}$ are in
the same rank order.
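The agreement between the pairwise form (4.3) and the rank form (4.5) can be checked numerically. The sketch below evaluates both for a given coefficient vector, which need not be the minimizer of (4.2) since the identity (4.4) is purely algebraic; it assumes no ties among the responses or among the fitted values. The function names and the small data set are hypothetical.

```python
from itertools import combinations


def sign(t):
    """Return -1, 0 or 1 according to the sign of t."""
    return (t > 0) - (t < 0)


def ranks(v):
    """Ranks 1..n of the entries of v (no ties assumed)."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for position, i in enumerate(order, start=1):
        r[i] = position
    return r


def fitted_values(rows, beta):
    """beta' x_i for each covariate row x_i."""
    return [sum(b * xk for b, xk in zip(beta, row)) for row in rows]


def tau_hat_pairs(rows, y, beta):
    """Pairwise form (4.3)."""
    fitted = fitted_values(rows, beta)
    num = den = 0.0
    for i, j in combinations(range(len(y)), 2):
        d = fitted[j] - fitted[i]
        num += d * sign(y[j] - y[i])
        den += abs(d)
    return num / den


def tau_hat_ranks(rows, y, beta):
    """Rank form (4.5): centered fitted values against two rank vectors."""
    fitted = fitted_values(rows, beta)
    mean = sum(fitted) / len(fitted)
    centered = [f - mean for f in fitted]
    num = sum(c * s for c, s in zip(centered, ranks(y)))
    den = sum(c * s for c, s in zip(centered, ranks(fitted)))
    return num / den


rows = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 2.0], [5.0, 1.0]]
y = [1.0, 4.0, 2.0, 6.0, 5.0]
beta = [1.0, 0.5]
print(tau_hat_pairs(rows, y, beta), tau_hat_ranks(rows, y, beta))
```

Both calls should agree up to rounding; on this data set the common value is 21/23.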

The formula (4.5) suggests an interesting generalization to
allow arbitrary scores instead of ranks. Simply replace the rank
vectors $\mathbf{S}$ and $\hat{\mathbf{S}}$ by the corresponding
permutations of a vector of nondecreasing scores $(a_1, \ldots, a_n)$.
It appears that such a statistic would have the same properties as
$\hat\tau$. This will be discussed in a subsequent paper.

5.  CONSISTENCY  OF  $\hat\tau$

In this section $\hat\tau$ is shown to be a consistent estimate of $\tau$
under model (4.1) with some additional regularity conditions:

(C1) The cdf $F$ has an absolutely continuous density function
$f$ with $\int (f'/f)^2 f\,dx < \infty$,

(C2) The difference of two independent random variables with
cdfs $F$ has cdf $G$ and density function $g$ which is
continuous at zero, $g(0) > 0$,

(C3) The random vector $X$ has a positive definite variance-
covariance matrix $\Sigma$,

(C4) There exists a positive $\delta$ such that
$E[(X - \mu)'(X - \mu)]^{2+\delta} < \infty$, where $\mu = E(X)$.

Some additional notation will be needed for the proofs of
this section. Define
$T_k(\beta) = \sum_{i<j} (X_{jk} - X_{ik})\,\mathrm{sgn}[Y_j - Y_i - \beta'(X_j - X_i)]$
and let $T(\beta) = (T_1(\beta), \ldots, T_p(\beta))'$. Also let
$L(\beta) = \sum_{i<j} |\beta'(X_j - X_i)|$. Let
$\Delta^* = (1/(2\gamma))\,n^{-3/2}\,\Sigma^{-1} T(0)$, where
$\gamma = \int f^2$.

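LEMMA 5.1. Assume model (4.1) and conditions C1-C4.
Then if $\beta_0 = 0$,

(i) $n^{-3/2}\,T(0) \xrightarrow{D} N(0, (1/3)\Sigma)$,

(ii) $\hat\Delta - \Delta^* \xrightarrow{P} 0$, where $\hat\Delta = \sqrt{n}\,\hat\beta$, and

(iii) $\hat\Delta \xrightarrow{D} N(0, (1/(12\gamma^2))\Sigma^{-1})$.

Note that for general $\beta_0$, $\sqrt{n}(\hat\beta - \beta_0)$ has the
same distribution as $\hat\Delta$ when $\beta_0 = 0$ and thus the limiting
distribution of (iii).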

Proof. The above results were given in Sievers (1983) for
the case of nonrandom $X_{ik}$. The assumptions A1-A8 of that
paper will hold almost everywhere in the present context if
$\max_{1 \le i \le n} |X_{ik} - \bar X_k| / \sqrt{n} \to 0$ a.e., for
$1 \le k \le p$, and $(1/n) A_c' A_c \to \Sigma$ a.e. as $n \to \infty$.
But these follow from conditions C3 and C4 and Lemma 4.1 of Ghosh and
Sen (1971). ■

THEOREM 5.1. Assume model (4.1) and conditions C1-C4.
Then $\hat\tau \xrightarrow{P} \tau$.

Proof. First consider the case $\beta_0 \ne 0$. Express $\hat\tau$ in the
form

\[ \hat\tau = (\hat\beta' T(0)/M) \,/\, (L(\hat\beta)/M), \tag{5.1} \]

where $M = \binom{n}{2}$. Similarly write

\[ \tau = \beta_0'\,\nu^*(\beta_0) \,/\, \nu(\beta_0), \]

where $\nu^*(\beta_0)$ is a $p \times 1$ vector with $k$th element
$E[(X_{2k} - X_{1k})\,\mathrm{sgn}(Y_2 - Y_1)]$ and
$\nu(\beta_0) = E[\,|\beta_0'(X_2 - X_1)|\,]$.

From Lemma 5.1, it follows that $\hat\beta \xrightarrow{P} \beta_0$. The
vector $T(0)/M$ is a vector of U-statistics which converges in probability
to $\nu^*(\beta_0)$ by the usual theory. For the denominator of (5.1),
note that $(L(\hat\beta) - L(\beta_0))/M \xrightarrow{P} 0$ since it is
bounded above in absolute value by
$\max_k |\hat\beta_k - \beta_{0k}| \sum_k \sum_{i<j} |X_{jk} - X_{ik}| / M$
and the latter converges to zero in probability. But $L(\beta_0)/M$ is a
U-statistic converging in probability to $\nu(\beta_0)$. It follows that
$\hat\tau \xrightarrow{P} \tau$ in case $\beta_0 \ne 0$.



Now consider the case $\beta_0 = 0$. The above argument does not
apply since both numerator and denominator of $\hat\tau$ tend to zero
and it is necessary to deal with the rates of convergence. First
express (5.1) in terms of $\hat\Delta = \sqrt{n}\,\hat\beta$ as

\[ \hat\tau = (\hat\Delta' T(0)/M) \,/\, (L(\hat\Delta)/M). \tag{5.2} \]

From Lemma 5.1 (iii), $\hat\Delta$ is $O_p(1)$ and, as above,
$T(0)/M \xrightarrow{P} \nu^*(0) = 0$. Thus $\hat\tau \xrightarrow{P} 0$ if
it is shown that the denominator of (5.2) is bounded away from zero in
probability.

To show this let $G_\delta = \{\Delta \in R^p : \|\Delta\| \ge \delta\}$
for $\delta > 0$, where $R^p$ is $p$-dimensional Euclidean space and
$\|\cdot\|$ the usual distance. Let the boundary be
$\partial G_\delta = \{\Delta \in R^p : \|\Delta\| = \delta\}$. By Lemma
5.1 (iii), $P(\hat\Delta \in G_\delta)$ can be made arbitrarily close to
one for all $n$ sufficiently large by taking $\delta$ sufficiently small.
Now for any fixed $X_1, \ldots, X_n$, $L(\Delta)$ is nonnegative, convex,
$L(0) = 0$ and so for any $\Delta' \in G_\delta$ there exists
$\Delta \in \partial G_\delta$ such that $L(\Delta) \le L(\Delta')$. Thus
if $\hat\Delta \in G_\delta$,
$L(\hat\Delta)/M \ge \inf_{\Delta \in \partial G_\delta} L(\Delta)/M$ and
it will be sufficient to show the latter is bounded away from
zero in probability.

To accomplish this a compactification argument can be used, since
$\partial G_\delta$ is a compact set. For any
$\Delta, \Delta' \in \partial G_\delta$ it can be shown
that $|L(\Delta) - L(\Delta')|/M \le \|\Delta - \Delta'\|\,V$, where
$V = \sum_k \sum_{i<j} |X_{jk} - X_{ik}| / M$ converges in probability.
Also $L(\Delta)/M \xrightarrow{P} \nu(\Delta)$ for any fixed
point and therefore uniformly for any finite set of points $\Delta$.
Finally, use the fact that
$\inf_{\Delta \in \partial G_\delta} \nu(\Delta) > 0$, since $\nu(\Delta)$
is nonnegative, convex, $\nu(0) = 0$ and $\nu(\Delta) = 0$ for some
$\Delta \ne 0$ would contradict the assumption of a positive definite
$\Sigma$ matrix. ■

6.  A  TEST  OF  INDEPENDENCE 

In this section a test of the hypothesis of independence is
considered for model (4.1). In view of Remark 3.3, this is the
hypothesis $H_0: \tau = 0$ (or $\beta_0 = 0$). The test will be based on
the numerator of $\hat\tau$, viewing its denominator as basically a
norming factor. The distribution theory for the numerator of $\hat\tau$
is readily available from the results of Section 5.

From (4.3) and (4.4), the numerator of $\hat\tau$ is
$\hat\beta' T(0) = 2\,\hat{\mathbf{Y}}'\mathbf{S}$,
where $\hat{\mathbf{Y}} = A_c\hat\beta$ is the centered vector of fitted
values and $\mathbf{S}$ is the rank vector of $\mathbf{Y}$. The proposed
test of the hypothesis $H_0: \tau = 0$ vs $H_1: \tau > 0$ is to reject
$H_0$ if $Q > \chi^2_{\alpha, p}$, where
$Q = (12\hat\gamma/n)\,\hat{\mathbf{Y}}'\mathbf{S}$, $\hat\gamma$ is a
consistent estimate of $\gamma = \int f^2$ (see
McKean and Hettmansperger (1976, 1977), Sievers and McKean
(1983)) and $\chi^2_{\alpha, p}$ is the quantile of order $1 - \alpha$ of a
chi-square distribution with $p$ degrees of freedom.

THEOREM 6.1. Assume model (4.1) and conditions C1-C4.
Then under $H_0$, $Q$ has a limiting chi-square distribution with $p$
degrees of freedom.

Proof. It is sufficient to replace $\hat\gamma$ by $\gamma$ and consider,
with notation from Section 5,

\[
(12\gamma/n)\,\hat{\mathbf{Y}}'\mathbf{S}
  = 6\gamma\,\hat\Delta'[\,n^{-3/2}\,T(0)\,]
  = 12\gamma^2\,\hat\Delta'\,\Sigma\,\Delta^*. \tag{6.1}
\]

Using Lemma 5.1, this has the same limiting distribution as
$12\gamma^2\,\Delta^{*\prime}\,\Sigma\,\Delta^*$, which is $\chi^2(p)$. ■


McKean and Hettmansperger (1976, 1977) have proposed a test
of the equivalent hypothesis $\beta_0 = 0$ based on a drop in
dispersion for the case of a fixed design matrix. In the notation here,
this statistic is $(12\hat\gamma/n)(D(0) - D(\hat\beta))$, where $D$ is
given in (4.2). The asymptotics of Section 5 can be used to show this
statistic is asymptotically equivalent to $Q$, and in this sense there is
agreement between the tests of $\tau = 0$ and $\beta_0 = 0$. Another test
statistic, asymptotically equivalent to $Q$, arises by replacing
$\hat\Delta$ by $\Delta^*$ in (6.1), namely
$3n^{-3}\,T(0)'\,\Sigma^{-1}\,T(0)$. This statistic
has the advantage of not requiring an estimate of the scale
parameter $\gamma$.
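As a rough illustration of the last statistic, the sketch below evaluates $3n^{-3}\,T(0)'\,\Sigma^{-1}\,T(0)$ for a single independent variable ($p = 1$). Two things here are assumptions of the sketch rather than statements from the paper: $\Sigma$ is replaced by the sample variance of the $X$ values, and the comparison uses the tabulated upper 5% point 3.841 of the chi-square distribution with one degree of freedom. The function name is hypothetical, and with a sample this small the asymptotic approximation is only indicative.

```python
from itertools import combinations


def sign(t):
    """Return -1, 0 or 1 according to the sign of t."""
    return (t > 0) - (t < 0)


def independence_statistic(x, y):
    """3 n^{-3} T(0)^2 / s2 for p = 1, where s2 is the sample variance of x
    (used here in place of the unknown Sigma; an assumption of this sketch)."""
    n = len(x)
    t0 = sum((x[j] - x[i]) * sign(y[j] - y[i])
             for i, j in combinations(range(n), 2))
    mean = sum(x) / n
    s2 = sum((v - mean) ** 2 for v in x) / (n - 1)
    return 3.0 * t0 ** 2 / (n ** 3 * s2)


CHI2_1_UPPER_5PCT = 3.841  # upper 5% point, chi-square with 1 df

x = [float(i) for i in range(1, 11)]
y = [2.0 * v + 1.0 for v in x]  # perfectly monotone in x
q = independence_statistic(x, y)
print(q, q > CHI2_1_UPPER_5PCT)
```

For this perfectly monotone example the statistic exceeds the critical value, so the hypothesis of independence would be rejected at the 5% level under the stated assumptions.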